(57) ABSTRACT

A gateway for screening packets transferred over a network. 
The gateway includes a plurality of network interfaces, a 
memory and a memory controller. Each network interface 
receives and forwards messages from a network through the 
gateway. The memory temporarily stores packets received 
from a network. The memory controller couples each of the 
network interfaces and is configured to coordinate the trans- 
fer of received packets to and from the memory using a 
memory bus. The gateway includes a firewall engine 
coupled to the memory bus. The firewall engine is operable 
to retrieve packets from the memory and screen each packet 
prior to forwarding a given packet through the gateway and 
out an appropriate network interface. A local bus is coupled 
between the firewall engine and the memory providing a 
second path for retrieving packets from memory when the 
memory bus is busy. An expandable external rule memory is 
coupled to the local bus and includes one or more rule sets 
accessible by the firewall engine using the local bus. The 
firewall engine is operable to retrieve rules from a rule set 
and screen packets in accordance with the retrieved rules. 

17 Claims, 10 Drawing Sheets

FIREWALL INCLUDING LOCAL BUS

BACKGROUND OF THE INVENTION

The present invention relates generally to data routing
systems, and more particularly to a method and apparatus for
providing secure communications on a network.

A packet switch communication system includes a network
of one or more routers connecting a plurality of users.
A packet is the fundamental unit of transfer in the packet
switch communication system. A user can be an individual
user terminal or another network. A router is a switching
device which receives packets containing data or control
information on one port, and based on destination information
contained within the packet, routes the packet out
another port to the destination (or intermediary destination).
Conventional routers perform this switching function by
evaluating header information contained within the packet in
order to determine the proper output port for a particular
packet.

The network can be an intranet, that is, a network connecting
one or more private servers such as a local area
network (LAN). Alternatively, the network can be a public
network, such as the Internet, in which data packets are
passed over untrusted communication links. The network
configuration can include a combination of public and
private networks. For example, two or more LANs can be
coupled together with individual terminals using a public
network such as the Internet. When public and private
networks are linked, data security issues arise. More
specifically, conventional packet switched communication
systems that include links between public and private networks
typically include security measures for assuring data
integrity.

In order to assure individual packet security, packet
switched communication systems can include encryption/
decryption services. Prior to leaving a trusted portion of a
network, individual packets can be encrypted to minimize
the possibility of data loss while the packet is transferred
over the untrusted portion of the network (the public
network). Upon receipt at a destination or another trusted
portion of the communication system, the packet can be
decrypted and subsequently delivered to a destination. The
use of encryption and decryption allows for the creation of
a virtual private network (VPN) between users separated by
untrusted communication links.

In addition to security concerns for the data transferred
over the public portion of the communications system, the
private portions of the network must safeguard against
intrusions through the gateway provided at the interface of
the private and the public networks. A firewall is a device
that can be coupled in-line between a public network and
private network for screening packets received from the
public network. Referring now to FIG. 1a, a conventional
packet switch communication system 100 can include two
private networks 102 coupled by a public network 104 for
facilitating the communication between a plurality of user
terminals 106. Each private network can include one or more
servers and a plurality of individual terminals. Each private
network 102 can be an intranet such as a LAN. Public
network 104 can be the Internet, or other public network
having untrusted links for linking packets between private
networks 102a and 102b. At each gateway between a private
network 102 and public network 104 is a firewall 110. The
architecture for a conventional firewall is shown in FIG. 1b.
Firewall 110 includes a public network link 120, private
network link 122 and memory controller 124 coupled by a
bus (e.g., PCI bus) 125. Memory controller 124 is coupled
to a memory (RAM) 126 and firewall engine 128 by a
memory bus 129. Firewall engine 128 performs packet
screening prior to routing packets through to private network
122. A central processor (CPU) 134 is coupled to memory
controller 124 by a CPU bus 132. CPU 134 oversees the
memory transfer operations on all buses shown. Memory
controller 124 is a bridge connecting CPU bus 132, memory
bus 129 and PCI bus 125.

Packets are received at public network link 120. Each
packet is transferred on bus 125 to, and routed through,
memory controller 124 and on to RAM 126 via memory bus
129. When firewall engine 128 is available, packets are
fetched using memory bus 129 and processed by firewall
engine 128. After processing by the firewall engine 128, the
packet is returned to RAM 126 using memory bus 129.
Finally, the packet is retrieved by memory controller 124
using memory bus 129, and routed to private network link
122.

Unfortunately this type of firewall is inefficient in a
number of ways. A majority of the traffic in the firewall
utilizes memory bus 129. However, at any time, memory bus
129 can allow only one transaction. Thus, memory bus 129
becomes a bottleneck for the whole system and limits
system performance.

The encryption and decryption services as well as
authentication services performed by firewall engine 128 typically
are performed in series. That is, a packet is typically required
to be decrypted prior to authentication. Serial processes
typically slow performance.

A conventional software firewall can sift through packets
when connected through a T-1 or fractional T-1 link. But at
T-3, Ethernet, or fast Ethernet speeds, software-based
firewalls running on an average desktop PC can get bogged
down.

SUMMARY OF THE INVENTION

In general, in one aspect, the invention provides a gateway
for screening packets transferred over a network. The
gateway includes a plurality of network interfaces, a
memory and a memory controller. Each network interface
receives and forwards messages from a network through the
gateway. The memory temporarily stores packets received
from a network. The memory controller couples each of the
network interfaces and is configured to coordinate the transfer
of received packets to and from the memory using a
memory bus. The gateway includes a firewall engine
coupled to the memory bus. The firewall engine is operable
to retrieve packets from the memory and screen each packet
prior to forwarding a given packet through the gateway and
out an appropriate network interface. A local bus is coupled
between the firewall engine and the memory providing a
second path for retrieving packets from memory when the
memory bus is busy. An expandable external rule memory is
coupled to the local bus and includes one or more rule sets
accessible by the firewall engine using the local bus. The
firewall engine is operable to retrieve rules from a rule set
and screen packets in accordance with the retrieved rules.

Aspects of the invention can include one or more of the
following features. The firewall engine can be implemented
in a hardware ASIC. The ASIC includes an authentication
engine operable to authenticate a retrieved packet
contemporaneously with the screening of the retrieved packet by the
firewall engine. The gateway includes a decryption/
encryption engine for decrypting and encrypting retrieved
packets.

The ASIC can include an internal rule memory for storing
one or more rule sets used by the firewall engine for 
screening packets. The internal rule memory includes
frequently accessed rule sets while the external rule memory is
configured to store less frequently accessed rule sets. The
internal rule memory includes a first portion of a rule set, and a
second portion of the rule set is stored in the external rule memory.
The memory can be a dual-port memory configured to 
support simultaneous access from each of the memory bus 
and the local bus. 

The gateway can include a direct memory access control- 
ler configured for controlling memory accesses by the 
firewall engine to the memory when using the local bus. 

In another aspect, the invention provides a rule set for use 
in a gateway. The gateway is operable to screen packets 
transferred over a network and includes a plurality of 
network interfaces, a memory, a memory controller and a 
firewall engine. Each network interface receives and for- 
wards messages from a network through the gateway. The 
memory is configured to temporarily store packets received 
from a network. The memory controller is coupled to each 
of the network interfaces and configured to coordinate the 
transfer of received packets to and from the memory using 
a memory bus. The firewall engine is coupled to the memory 
bus and operable to retrieve packets from the memory and 
screen each packet prior to forwarding a given packet 
through the gateway and out an appropriate network inter- 
face. The rule set includes a first and second portion of rules. 
The first portion of rules are stored in an internal rule 
memory directly accessible by the firewall engine. The 
second portion of rules are expandable and stored in an
external memory coupled by a bus to the firewall engine and 
are accessible by the firewall engine to screen packets in 
accordance with the retrieved rules. 

Aspects of the invention can include one or more of the
following features. The rule set can include a counter rule.
The counter rule includes matching criteria, a count, a
count threshold and an action. The count is incremented
after each detected occurrence of a match between a packet
and the matching criteria associated with the counter rule.
When the count exceeds the count threshold the action is
invoked.

The first portion of rules can include a pointer to a 
location in the second portion of rules. The pointer can be in 
the form of a rule that includes both a pointer code and also
an address in the external memory designating a next rule to 
evaluate when screening a current packet. The next rule to 
evaluate is included in the second portion of rules. 

In another aspect, the invention provides a gateway for
screening packets received from a network and includes a
plurality of network interfaces each for transmitting and
receiving packets to and from a network. The gateway
includes an integrated packet processor including a separate
firewall engine, authentication engine, and a direct memory
access controller, and a dual-port memory for storing packets. A
memory bus is provided for coupling the network interfaces,
the packet processor and the dual-port memory. A local bus
couples the packet processor and the dual-port memory. The
packet processor invokes the direct memory access controller
to retrieve a packet directly from the dual-port memory
using the local bus. A memory controller is included for
controlling the transfer of packets from the network interfaces
to the dual-port memory. A processing unit extracts
information from a packet and provides the information to
the packet processor for processing.

Aspects of the invention can include one or more of the 
following features. The integrated packet processor can
include a separate encryption/decryption engine for
encrypting and decrypting packets received by the gateway.

The invention can include one or more of the following 
advantages. A local bus is provided for local access to 
memory from the firewall ASIC. The solution is
implemented in hardware, easily handling dense traffic that would
have choked a conventional firewall. A combination firewall 
and VPN (virtual private network) solution is provided that 
includes a separate stand-alone firewall engine, encryption/ 
decryption engine and authentication engine. Each engine 
operates independently and exchanges data with the others. 
One engine can start processing data without waiting for 
other engines to finish all their processes. Parallel processing 
and pipelining are provided and deeply implemented into 
each engine and each module further enhancing the whole 
hardware solution. The high processing speed of hardware 
increases the throughput rate by a factor of ten. Other 
advantages and features will be apparent from the following
description and claims. 

BRIEF DESCRIPTION OF THE DRAWING 

FIG. 1a is a block diagram of a conventional packet
switch communication system.

FIG. 1b is a block diagram of a conventional firewall
device.

FIG. 2 is a schematic block diagram of a communication
system including a local bus and ASIC in accordance with the
invention.

FIG. 3 is a flow diagram for the flow of packets through
the communication system of FIG. 2.

FIG. 4 is a schematic block diagram of the ASIC of FIG. 2.

FIG. 5 illustrates a rule structure for use by the firewall 
engine. 

FIG. 6a is a flow diagram for a firewall screening process. 
FIG. 6b is an illustration of a pipeline for use in rule 
searching. 

FIG. 7 is a flow diagram for an encryption process. 
FIG. 8 is a flow diagram for an authentication process. 

DESCRIPTION OF THE PREFERRED 
EMBODIMENTS 

Referring to FIG. 2, a communication system 200 
includes a public network link 120, private network link 122 
and memory controller 124 coupled by a bus 125. Commu- 
nication system 200 can be a gateway between two distinct 
networks, or distinct portions of a network. The gateway can 
bridge between trusted and untrusted portions of a network 
or provide a bridge between a public and private network. 
Each network link 120 and 122 can be an Ethernet link that
includes an Ethernet media access controller (MAC) and
Ethernet physical layer (PHY) for allowing the communication
system to receive/send packets from/to networks. A
memory bus 129 couples a memory controller 124 to a
dual-port memory 203 and an application specific integrated
circuit (ASIC) 204. Local bus 202 also links ASIC 204 to
dual-port memory 203. Dual-port memory 203 can be a
random access memory (RAM) with two separate ports. Any
memory location can be accessed from the two ports at the
same time.

Associated with ASIC 204 is an off-chip rule memory 206 
for storing a portion of the software rules for screening 
packets. Local bus 202 couples rule memory 206 to ASIC 
204. Off-chip rule memory 206 can be a static RAM and is
used to store policy data. The structure and contents of the
off-chip memory are discussed in greater detail below.

A central processor (CPU) 134 is coupled to memory 
controller 124 by CPU bus 132. CPU 134 oversees the
memory transfer operations on memory bus 129 and bus 
125. 

Referring now to FIGS. 2 and 3, a process 300 for 
screening packets is described in general. Packets are 
received at public network link 120 (302). Each packet is 
transferred on bus 125 to, and routed through, memory 
controller 124 and on to dual-port memory 203 via memory 
bus 129 (304). When ASIC 204 is available, the packet is 
fetched by ASIC 204 using local bus 202 (306). After 
processing by ASIC 204 (308), the packet is returned to 
dual-port memory 203 using local bus 202 (310). The processing by
ASIC 204 can include authentication, encryption, 
decryption, virtual private network (VPN) and firewall ser- 
vices. Finally, the packet is retrieved by memory controller 
124 using memory bus 129 (312), and routed to private 
network link 122 (314). 

Referring now to FIG. 4, the heart of the communications 
system is ASIC 204. ASIC 204 integrates a firewall engine, 
VPN engine and local bus direct memory access (DMA) 
engine in a single chip. ASIC 204 includes a firewall engine 
400, an encryption/decryption engine 402, an authentication 
engine 404, an authentication data buffer 406, a host inter- 
face 408, a local bus DMA engine 410, a local bus interface 
412 and on-chip rule memory 414. 

Host interface 408 provides a link between ASIC 204 and 
memory bus 129. Packets are received on host interface 408
and processed by ASIC 204. 

Firewall engine 400 enforces an access control policy 
between two networks. Firewall engine utilizes rules stored 
in on-chip rule memory 414 and off-chip rule memory 206. 

A VPN module is provided that includes encryption/ 
decryption engine 402 and authentication engine 404. 

Encryption/decryption engine 402 performs encryption or 
decryption with one or more encryption/decryption algo- 
rithms. In one implementation, a data encryption standard 
(DES) or Triple-DES algorithm can be applied to transmit- 
ted data. Encryption assures confidentiality of data, protect- 
ing the data from passive attacks, such as interception, 
release of message contents and traffic analysis. 

Authentication engine 404 assures that a communication 
(packet) is authentic. In one implementation MD5 and 
SHA1 algorithms are invoked to verify authentication of
packets. Authentication buffer 406 is a temporary buffer for 
storing partial results generated by authentication engine 
404. The localized storage of partial results allows the 
authentication process to proceed without requiring the 
availability of the local bus or memory bus. The partial 
results can be temporarily stored in authentication buffer 406 
until the appropriate bus is free for transfers back to dual- 
port memory 203. 

Local bus DMA engine 410 facilitates access to dual-port 
memory 203 using local bus 202. As such, CPU 134 is freed
to perform other tasks including the transfer of other packets 
into dual-port memory 203 using memory bus 129. 

There are two rule memories in the communication 
system, on-chip rule memory 414 inside ASIC 204, and 
off-chip rule memory 206, that is external to ASIC 204. 
From a functionality point of view, there is no difference 
between these two memories. The external memory enlarges 
the whole rule memory space. Rule searching can be
implemented in a linear order with the internal rule memory first.
Of course, the searching process is faster when performed in
the on-chip rule memory. The structure for the rules is
described in greater detail below.

A rule is a control policy for filtering incoming and
outgoing packets. Rules specify actions to be applied
against a certain packet. When a packet is received for
inspection (rule search), the packet's IP header (six 32-bit
words), TCP header (six 32-bit words) or UDP header (two
32-bit words) may require inspecting. A compact and
efficient rule structure is provided to handle all the needs of
firewall engine 400. In one implementation, a minimal set of
information is stored in a rule including the source/
destination IP addresses, UDP/TCP source/destination
addresses and transport layer protocol. This makes the rule
set compact, yet sufficient for screening services. The
structure 500 of a rule is shown in FIG. 5. Rules can include
a source/destination IP address 502, 503, a UDP/TCP
source/destination port 504, 505, counter 506, source/
destination IP address mask 508, transport layer protocol
510, general mask (GMASK), searching control field
512 and a response action field 514. In one embodiment,
each rule includes six 32-bit words. Reserved bits are set to
have a logical zero value.

Searching control field 512 is used to control where to
continue a search and when to search in the off-chip rule
memory 206. In one implementation, searching control field
512 is four bits in length including bits B31-B28.

The rule set can contain two types of rules. In one
implementation, the two rule types are distinguished by bit
B31 of the first word in a rule. A logical zero value indicates
a type "0" rule, referred to as a normal rule. A logical one
value indicates a type "1" rule. A type-1 rule holds an address
pointing to a starting location in the external rule memory at
which point searching is to continue for a given packet.

On-chip memory 414 includes space for many rules for
handling the packet traffic in to and out from different
interfaces (such as from a trusted interface (private network
interface 120) to an untrusted interface (public network
interface 122)). If a rule set is too large to be contained in
on-chip rule memory 414, a portion of the rule set can be
placed in the on-chip memory 414 and the remainder placed
in off-chip rule memory 206. When a rule set is divided and
includes rules in both on and off-chip memories, the final
rule contained in the on-chip memory 414 for the rule set is
a type-1 rule. Note that this final rule is not to be confused
with the last rule of a rule set described below. The final rule
merely is a pointer to a next location at which searching is
to continue.
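
The division of a rule set between the two memories can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: rules are modeled as opaque Python values, the tuple `("POINTER", address)` stands in for a type-1 rule, and the names `split_rule_set`, `on_chip_slots` and `off_chip_base` are hypothetical.

```python
from typing import Any, List, Tuple


def split_rule_set(rules: List[Any], on_chip_slots: int,
                   off_chip_base: int) -> Tuple[List[Any], List[Any]]:
    """Split a rule set between on-chip and off-chip rule memory.

    If the set does not fit on chip, the last on-chip slot is used for a
    pointer entry (standing in for a type-1 rule) that names the off-chip
    address where the search should continue.
    """
    if on_chip_slots < 1:
        raise ValueError("at least one on-chip slot is required")
    if len(rules) <= on_chip_slots:
        # Everything fits on chip; no pointer entry is needed.
        return list(rules), []
    keep = on_chip_slots - 1  # reserve the final slot for the pointer
    on_chip = list(rules[:keep]) + [("POINTER", off_chip_base)]
    off_chip = list(rules[keep:])
    return on_chip, off_chip
```

With three on-chip slots and five rules, two rules stay on chip, the third slot holds the pointer, and the remaining three rules are placed off chip.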

When firewall engine 400 reaches a rule that is identified
as a type-1 rule (bit B31 is set to a logical one value),
searching for the rule set continues in off-chip memory. The
engine uses the address provided in bits B0-B13 of the
sixth word of the type-1 rule and continues searching in
off-chip rule memory 206 at the address indicated. Bit B30
is a last rule indicator. If bit B30 is set to a logical one value,
then the rule is the last rule in a rule set. Rule match
processes end after attempting to match this rule. Bit B29 is
a rule set indicator. When bit B29 is set to a logical one
value, the rule match process will not stop when the packet
matches the rule. When bit B29 is set to a logical zero value,
the rule match process stops when the packet matches the
rule. Note that bit B29 applies only when bit B2 is set. When
bit B2 is set to a logical zero value, regardless of the value
of bit B29, the rule match process always stops when a
match is found. The value and use of bit B2 is discussed in
greater detail below. In the implementation described, bit
B28 is reserved.
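
The bit assignments described above for the searching control field can be illustrated with a short decoding sketch. The function and class names are hypothetical, and the rule words are assumed to be plain 32-bit unsigned integers with B0 as the least significant bit; only the bit positions come from the description.

```python
from typing import NamedTuple


class SearchControl(NamedTuple):
    is_pointer: bool         # B31: 1 = type-1 (pointer) rule, 0 = normal rule
    is_last: bool            # B30: last rule in the rule set
    continue_on_match: bool  # B29: do not stop on a match (counter rules only)


def decode_search_control(first_word: int) -> SearchControl:
    """Decode bits B31-B29 of the first word of a rule (B28 is reserved)."""
    return SearchControl(
        is_pointer=bool(first_word & (1 << 31)),
        is_last=bool(first_word & (1 << 30)),
        continue_on_match=bool(first_word & (1 << 29)),
    )


def pointer_address(sixth_word: int) -> int:
    """Off-chip address held in bits B0-B13 of the sixth word of a type-1 rule."""
    return sixth_word & 0x3FFF


def stops_on_match(ctrl: SearchControl, is_counter_rule: bool) -> bool:
    """B29 is honored only when B2 (counter rule) is set; otherwise a match stops the search."""
    return not (is_counter_rule and ctrl.continue_on_match)
```

Under this reading, a counter rule with B29 set keeps the search going after a match, while any rule with B2 cleared ends the search on its first match.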
The source/destination IP address 502, 503 defines a
source and a destination address that is used as a matching
criterion. To match a rule, a packet must have come from the
defined source IP address and its destination must be the
defined destination IP address.

The UDP/TCP source/destination port 504, 505 specifies
what client or server process the packet originates from on
the source machine. Firewall engine 400 can be configured
to permit or deny a packet based on these port numbers. In
one implementation, the rule does not include the actual
TCP/UDP port, but rather a range for the port. A port opcode
(PTOP) can be included for further distinguishing if a match
condition requires the actual TCP/UDP port falls inside or
outside the range. This is very powerful and allows for a
group of ports to match a single rule. In one implementation,
the range is defined using a high and low port value. In one
implementation, bit B26 is used to designate a source port
opcode match criterion. When the B26 bit is set to a logical
zero, the packet source port must be greater than or equal to
the source port low and less than or equal to the source port
high in order to achieve a match. When the B26 bit is set to
a logical one value, the packet source port must be less than
the source port low or greater than the source port high.
Similarly, the B27 bit is used to designate a destination port
opcode match criterion. When bit B27 is set to a logical zero
value, the packet destination port must be greater than or
equal to the destination port low and less than or equal to the
destination port high in order to achieve a match. Again, a
one value indicates that the packet destination port should be
less than the destination port low value or greater than the
destination port high value to achieve a match for the rule.
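
The port range test with its inside/outside opcode can be expressed compactly. The sketch below is illustrative only; the function name and the boolean `outside` parameter (standing in for bit B26 or B27 being set to a logical one) are assumptions.

```python
def port_matches(port: int, low: int, high: int, outside: bool) -> bool:
    """Range test for a TCP/UDP port.

    outside=False (opcode bit 0): match when low <= port <= high.
    outside=True  (opcode bit 1): match when port < low or port > high.
    """
    inside = low <= port <= high
    return not inside if outside else inside
```

For example, a rule with the range 1 to 1023 and the opcode bit cleared matches port 80, while the same range with the opcode bit set matches port 8080 instead.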

Counter 506 is a high performance hardware counter. 
Counter 506 records a number of times that a particular rule 
has matched and is updated after each match is determined. 
In one implementation, at a defined counter threshold, 
counter 506 can trigger firewall engine 400 to take certain 
actions. In one implementation, the defined threshold for the 
counter is predefined. When the counter reaches the thresh- 
old value, a register bit is set. Software can monitor the 
register and trigger certain actions, such as deny, log and 
alarm. When a rule is created, an initial value can be written 
into the counter field. The difference between the initial 
value and the hardware predefined threshold determines the 
actual threshold. Generally speaking, the hardware ASIC 
provides a counting mechanism to allow for the software 
exercise of actions responsive to the count. 
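
The relationship between the initial counter value and the fixed hardware threshold can be sketched as follows. The threshold value `0xFFFFFF` is an assumption chosen to fit a 24-bit counter; the description states only that the threshold is predefined. The class and function names are hypothetical.

```python
HW_THRESHOLD = 0xFFFFFF  # assumed value; the source says only "predefined"


def initial_count(matches_wanted: int, hw_threshold: int = HW_THRESHOLD) -> int:
    """Initial counter value so the threshold is reached after `matches_wanted` matches."""
    if not 0 <= matches_wanted <= hw_threshold:
        raise ValueError("matches_wanted is out of range for the counter")
    return hw_threshold - matches_wanted


class RuleCounter:
    """Models counter 506 and the register bit that software polls."""

    def __init__(self, initial: int, hw_threshold: int = HW_THRESHOLD) -> None:
        self.count = initial
        self.hw_threshold = hw_threshold
        self.threshold_reached = False  # stands in for the status register bit

    def record_match(self) -> None:
        self.count += 1
        if self.count >= self.hw_threshold:
            self.threshold_reached = True
```

Software that wants an action after three matches would write `initial_count(3)` into the counter field and then poll `threshold_reached`.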

Source/destination IP address mask 508 allows for the 
masking of less significant bits of an IP address during IP 
address checking. This allows a destination to receive pack- 
ets from a group of sources or allow a source to broadcast 
packets to a group of destinations. In one implementation, 
two masks are provided: an Internet protocol source address 
(IPSA) mask and an Internet protocol destination address 
(IPDA) mask. 

The IPSA mask can be five bits in length and be encoded 
as follows: 00000, no bits are masked (all 32-bits are to be 
compared); 00001, bit "0" of the source IP address is masked 
(bit "0" is a DON'T CARE when matching the rule); 00010,
bit 1 and bit 0 are masked; 01010, the least 10 bits are 
masked; and 11111, only bit 31 (the MSB) is not masked. 
The IPDA mask is configured similar to the IPSA mask and 
has the same coding, except that the mask applies to the 
destination IP address. 
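
The five-bit mask encoding listed above amounts to "ignore the n least significant bits", where n is the encoded value. A minimal sketch, with hypothetical function names and addresses handled as 32-bit integers:

```python
import ipaddress


def compare_mask(code: int) -> int:
    """32-bit mask with ones in the bit positions that are still compared.

    `code` is the five-bit IPSA or IPDA mask value: the `code` least
    significant address bits are treated as DON'T CARE.
    """
    if not 0 <= code <= 31:
        raise ValueError("mask code must fit in five bits")
    return (0xFFFFFFFF << code) & 0xFFFFFFFF


def ip_matches(packet_ip: str, rule_ip: str, code: int) -> bool:
    """Compare two IPv4 addresses, ignoring the masked low-order bits."""
    mask = compare_mask(code)
    p = int(ipaddress.IPv4Address(packet_ip))
    r = int(ipaddress.IPv4Address(rule_ip))
    return (p & mask) == (r & mask)
```

A code of 8, for instance, lets a rule written for 192.168.1.0 match any source in 192.168.1.0 through 192.168.1.255.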

Transport layer protocol 510 specifies which protocol 
above the IP layer (TCP, UDP, etc.) the policy rule is to be 
enforced against. In one implementation, transport layer
protocol field 510 is an 8-bit field. For a rule match to arise,
the transport layer protocol field 510 must match the packet
IP header protocol field. However, if the B6 bit is set to a
logical one, the transport layer protocol field is disregarded
(a DON'T CARE as described above).

GMASK field 512
indicates to firewall engine 400 whether to ignore or check
the packet's source IP address, destination IP address,
protocol or packet acknowledgment or reset bits. Other masks
can also be included. In one implementation, the GMASK
includes four bits designated B4-B7. When the B4 bit is set
to a logical one, the packet source IP address is disregarded
when matching the rule (source IP address comparison result
will not be considered when determining whether or not the
packet matches the rule). When the B5 bit is set to a logical
one, the packet destination IP address is disregarded when
matching the rule (destination IP address comparison result
will not be considered when determining whether or not the
packet matches the rule). When the B6 bit is set to a logical
one, the packet protocol field is disregarded when matching
the rule (packet protocol field comparison result will not be
considered when determining whether or not the packet
matches the rule). Finally, when the B7 bit is set to a logical
one, both the packet acknowledge (ACK) bit and reset bit
are disregarded when matching the rule. When the B7 bit is
set to a logical zero, the packet ACK bit and/or reset bit must
be set (to a logical one value) for a match to arise.
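
The effect of the four GMASK bits on a match decision can be sketched as follows. This is an illustration under assumptions: the individual comparison results are passed in as booleans, port comparisons are left out, and the names are hypothetical. Only the bit positions B4-B7 and their meanings are taken from the description.

```python
IGNORE_SRC_IP = 1 << 4    # B4: disregard source IP comparison
IGNORE_DST_IP = 1 << 5    # B5: disregard destination IP comparison
IGNORE_PROTOCOL = 1 << 6  # B6: disregard protocol comparison
IGNORE_ACK_RST = 1 << 7   # B7: disregard ACK and reset bits


def gmask_allows_match(flags: int, src_ip_equal: bool, dst_ip_equal: bool,
                       protocol_equal: bool, ack_or_rst_set: bool) -> bool:
    """Combine comparison results, skipping any check whose GMASK bit is set."""
    checks = (
        (IGNORE_SRC_IP, src_ip_equal),
        (IGNORE_DST_IP, dst_ip_equal),
        (IGNORE_PROTOCOL, protocol_equal),
        (IGNORE_ACK_RST, ack_or_rst_set),
    )
    # A check passes if it is masked out or if its comparison succeeded.
    return all(bool(flags & bit) or ok for bit, ok in checks)
```

With B7 cleared, a packet that has neither ACK nor reset set fails the rule even if every address and protocol comparison succeeds.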

Response action field 514 can be used to designate an 
action when a rule match is detected. Examples of actions 
include permit/deny, alarm and logging. In one
implementation, response action field 514 is four bits in
length including bits B0 to B3. In one implementation, the
B0 bit is used to indicate a permit or deny action. A logical
one indicates that the packet should be permitted if a match
to this rule occurs. A logical zero indicates that the packet
should be denied. The B1 bit is used as an alarm indication.
A logical one indicates that an alarm should be sent if the
packet matches the particular rule. If the bit is not set, then
no alarm is provided. Alarms are used to indicate a possible 
security attack or an improper usage. Rules may be included 
with alarm settings to provide a measure of network security. 
When a match occurs, an alarm bit can be set in a status 
register (described below) to indicate to the CPU that the 
alarm condition has been satisfied. Depending on the num- 
ber or kinds of alarms, the CPU can implement various 
control mechanisms to safeguard the communications net- 
work. 

The B2 bit can be used to indicate a counter rule. A logical 
one indicates that the rule is a counter rule. For a counter 
rule, the least 24 bits of the second word of the rule are a 
counter (otherwise, the least 24 bits are reserved for a 
non-counter rule). The counter increments whenever a
packet matches the rule. A counter rule can include two
types: a counter-only rule and accumulate (ACL) rule with
counter enabled. When matching a counter-only rule, the
count is incremented but searching continues at a next rule
in the rule set. When matching an ACL rule with counter
enabled, the counter is incremented and searching termi- 
nates at the rule. The B3 bit is a log indication. A logical one 
indicates that the packet information should be logged if a 
match arises. 
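
The four response action bits can be decoded as shown in the sketch below. The names are hypothetical, and the field is assumed to be supplied as an integer whose least significant bit is B0.

```python
from typing import NamedTuple


class ResponseAction(NamedTuple):
    permit: bool        # B0: 1 = permit, 0 = deny
    alarm: bool         # B1: raise an alarm on a match
    counter_rule: bool  # B2: rule carries a counter
    log: bool           # B3: log packet information on a match


def decode_response_action(flags: int) -> ResponseAction:
    """Decode bits B0-B3 of the response action field."""
    return ResponseAction(
        permit=bool(flags & 0x1),
        alarm=bool(flags & 0x2),
        counter_rule=bool(flags & 0x4),
        log=bool(flags & 0x8),
    )
```

A field value of binary 1011, for example, describes a rule that permits the packet, raises an alarm and logs the match, without counting.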

Referring now to FIGS. 2, 4 and 6a, a process 600
executed by firewall engine 400 is shown for screening
packets using both the on-chip and off-chip rule memories.
The firewall engine process begins at step 602. A packet is
received at an interface (public network interface 122) and
transferred to dual-ported memory 203 using a DMA
process executed by memory controller 124 (604).

CPU 134 reads packet header information from packet
memory, then writes the packet information into special
registers on ASIC 204 (606). These registers are mapped
onto the system memory space, so CPU 134 has direct
access to them. In one implementation the registers include:
a source IP register, for storing the packet source IP address;
a destination IP register, for storing the packet destination IP
address; a port register, for storing the TCP/UDP source and
destination ports; a protocol register for storing the transport
layer protocol; and an acknowledge (ACK) register for
storing the ACK bit from the packet.

CPU 134 also specifies which rule set to search by writing
to a rule set specifier register (608). In one implementation,
a plurality of rule sets are stored in rule memory, each having
a starting address. In one implementation, two rule sets are
available and two registers are used to store the starting
addresses of each rule set. Depending on the value written
to the rule set specifier, the searching begins at the appointed
rule set.

CPU 134 issues a command to firewall engine 400 by
writing to a control register [illegible]
(610). Firewall engine 400 compares the contents of the
special registers to each rule in sequence (611) until a match
[illegible]
(613). If the match is to a counter rule (614), then the count
is incremented (615) and the search continues (back at step
612). If the counter threshold is exceeded or if the search
locates a match (non-counter match), the search results are
written to a status register (616). In one implementation, the
status register includes ten bits including: a search done bit
indicating a search is finished; a match bit indicating a match
has been found; a busy bit indicating (when set) that the
firewall engine is performing a search; an error bit
indicating an error occurred during the search; a permit/deny bit
to signal the firewall to permit or deny the inspected packet;
an alarm bit to signal the firewall if an alarm needs to be
raised; a log bit to signal the firewall if the packet needs to
be logged; a VPN bit to signal the system if the packet needs
VPN processing; a counter rule address bit to store the
matched counter rule address; and a counter full bit for
indicating the counter has reached a threshold.

While firewall engine 400 is doing a search, CPU 134
polls the status register to check whether the engine is busy
or has finished the search (618). When the CPU 134
determines the search is complete, CPU 134 executes certain
actions against the current packet based on the information
in the status register, such as permit or deny the packet,
signal an alarm and log the packet (620).

The search may find no match and if so, the packet can be
discarded. If the packet is permitted, other operations like
encryption/decryption or authentication can be performed
on the packet as required. When all of the required
operations are completed, the packet can be transmitted through
a network interface (private network interface 120). After
the appropriate action has been invoked, the process ends
(622).

To speed the rule search process, a pipelining
methodology is included in ASIC 204. A pipeline is a common design
methodology that is deeply implemented in the ASIC
design. Basically, a lengthy process is chopped into many
independent sub-processes in a sequence. A new process
can be started without waiting for a previously invoked
process to finish. In firewall engine 400, a rule search is
completed in 3 clock cycles using a pipeline process. During
the first clock cycle, rule information is fetched from rule
memory. During the second clock cycle, an IP address
comparison is performed. Finally, during the third clock
cycle, a TCP/UDP port comparison is performed. Each of
these 3 steps are independent sub-processes of a rule search.
A pipeline is then applied to the rule search process. FIG. 6b
illustrates the pipeline design. When a rule search starts, the
1st rule information is fetched in the 1st clock cycle. In the
2nd clock cycle, the IP address of the current packet is
compared with the rule. At the same clock cycle, the 2nd rule
information is fetched, that is the 2nd rule search starts. The
process continues in this manner until the search is
completed. [illegible] every clock cycle, not including the
3-clock latency. If the pipeline was not used, the rule search
[illegible].

[illegible] an encryption/decryption process 700 is shown.
A packet is received at a network interface and DMA'ed to
packet memory (dual-port RAM 203) (702). If the packet is
permitted after the firewall inspection (704) and encryption
or decryption is needed (706), then the process continues at
step 708.

In step 708, CPU 134 writes information needed by the
encryption/decryption engine 402 into special registers on
ASIC 204. In one implementation, the special registers
include: [illegible] encryption/decryption engine 402;
initial vector (IV) registers, for storing the initial vectors
used by encryption/decryption engine 402; a DMA source
address register, for [illegible]; a DMA destination address
register, for [illegible] CPU 134 can find the
encryption/decryption results; and a DMA count register,
for indicating how many words of the [illegible] encrypted
or decrypted. CPU 134 issues a command [illegible]
encryption or decryption operation (710). In one
implementation, this is accomplished by [illegible].
Encryption/decryption engine 402 determines which
operation to invoke (encryption or decryption) (712). Keys
for the appropriate [illegible] (714). Encryption/decryption
engine 402 uses the keys to encrypt/decrypt [illegible]. In
one implementation, encryption/decryption engine 402 uses
DMA block transfers to retrieve portions of the packet from
dual-port memory 203. As each block is encrypted/decrypted,
the results are transferred back to the dual-port memory 203
(718). Again, DMA block data transfers can be used to write
blocks of data back to dual-port memory 203 starting at the
address indicated by the DMA destination register. The
encryption/decryption engine also writes a busy signal into
a DES status register to indicate to the system that the
encryption/decryption engine is operating on a packet.

When encryption/decryption engine 402 completes a job
(720), the engine indicates the success or failure by writing
a bit in DES status register (722). In one implementation, the
DES status register includes a DES done bit, for indicating
that the engine has finished encryption or decryption; and a
DES error bit, indicating that an error has occurred in the
encryption/decryption process.

CPU 134 polls the DES status register to check if the
encryption/decryption engine has completed the job. When
the DES status register indicates the job is complete, CPU
134 can access the results starting at the address indicated by
the DMA destination address register. At this point, the
encrypted/decrypted data is available for further processing
by CPU 134, which in turn builds a new packet for transfer
through a network interface (726). Thereafter the process
ends (728).
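The register handshake described for the encryption/decryption engine can be sketched in software as follows. This is a simplified model under stated assumptions, not the ASIC design: the class and function names, the status bit positions, the block size and the single-byte key are hypothetical, the engine runs synchronously, and a byte-wise XOR stands in for DES purely as a placeholder transform.

```python
DES_DONE = 0x1    # hypothetical bit positions; the text names the bits only
DES_ERROR = 0x2
DES_BUSY = 0x4

WORD = 4          # assumed word size in bytes
BLOCK_WORDS = 2   # assumed DMA block size in words


class CryptoEngineModel:
    """Toy model of the engine's DMA and status registers."""

    def __init__(self, memory: bytearray):
        self.memory = memory      # stands in for dual-port memory 203
        self.dma_src = 0          # DMA source address register
        self.dma_dst = 0          # DMA destination address register
        self.dma_count = 0        # DMA count register (words)
        self.key = 0              # placeholder key, one byte
        self.status = 0           # DES status register

    def start(self) -> None:
        self.status = DES_BUSY
        total = self.dma_count * WORD
        limit = len(self.memory)
        if self.dma_src + total > limit or self.dma_dst + total > limit:
            self.status = DES_DONE | DES_ERROR
            return
        step = BLOCK_WORDS * WORD
        for offset in range(0, total, step):
            end = min(offset + step, total)
            block = self.memory[self.dma_src + offset:self.dma_src + end]
            out = bytes(b ^ self.key for b in block)   # placeholder, NOT DES
            self.memory[self.dma_dst + offset:self.dma_dst + end] = out
        self.status = DES_DONE


def cpu_transform(engine, src: int, dst: int, words: int, key: int) -> bytes:
    """CPU side: load registers, start the job, poll, read the result."""
    engine.key = key
    engine.dma_src, engine.dma_dst, engine.dma_count = src, dst, words
    engine.start()                          # models writing the start command
    while not engine.status & DES_DONE:     # models polling the status register
        pass
    if engine.status & DES_ERROR:
        raise RuntimeError("engine reported an error")
    return bytes(engine.memory[dst:dst + words * WORD])


memory = bytearray(64)
memory[0:16] = b"ABCDEFGHIJKLMNOP"
engine = CryptoEngineModel(memory)
scrambled = cpu_transform(engine, src=0, dst=16, words=4, key=0x5A)
restored = cpu_transform(engine, src=16, dst=32, words=4, key=0x5A)
```

Because the placeholder transform is its own inverse, running the same job on the scrambled data restores the original bytes, which makes the block-by-block transfer easy to check.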
Referring now to FIGS. 2, 4 and 8, a process 800 for
authenticating packets is shown. The process begins after a 
packet is received at a network interface and DMA'ed to 
dual-port memory 203 (802). If the packet is permitted (804) 
after the firewall inspection (803) and authentication is 
needed (806), the following operations are performed. Else 
the packet is dropped and the process ends (830). 

An authentication algorithm is selected (808). In one
implementation, two authentication algorithms (MD5 and
SHA1) are included in authentication engine 404. Both the
MD5 and SHA1 algorithms operate in a similar manner and
can share some registers on ASIC 204. Only one is required
for authentication of a packet. As an example, an MD5
authentication process is described below. The SHA1
process is similar for the purposes of this disclosure.

CPU 134 writes related information into MD5-related
registers on ASIC 204 (810). In one implementation, ASIC
204 includes a plurality of MD5 registers for supporting the
authentication process including: MD5 state registers, for
storing the initial values used by the MD5 authentication
algorithm; a packet base register, for storing the starting
address of the message to be processed; a packet length
register, for storing the length of the message to be
processed; an MD5 control register, for signaling the availability
of a packet for processing; and an MD5 status register.

CPU 134 issues a command to start the MD5 process 
(811) by writing to the MD5 control register (812). The 
authentication engine 404 begins the process by writing a 
busy signal to the MD5 status register to let CPU 134 know 
the authentication engine is processing a request 
(authenticating a packet). Authentication engine 404 pro- 
cesses the packet (813) and places the digest result into the 
MD5 state registers (814). When the job is complete (815), 
authentication engine 404 signals the completion by setting 
one or more bits in the MD5 status register (816). In one 
implementation, two bits are used: an MD5 done bit,
indicating authentication engine 404 has finished the
authentication process; and an MD5 error bit, indicating that an error
occurred. CPU 134 polls the MD5 status register to deter- 
mine if the authentication job is complete (817). When the 
MD5 done bit is set, CPU 134 reads out the digest results 
from the MD5 state registers (818). Thereafter, the process 
ends (830). 
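The MD5 register sequence above can be modeled in a few lines. The sketch below makes several assumptions that do not come from the source: the class and function names and the status bit positions are hypothetical, the engine runs synchronously, and the digest is computed with Python's `hashlib.md5` rather than from the values loaded into the state registers. The four initial state words are the standard MD5 initialization constants.

```python
import hashlib
import struct

MD5_DONE = 0x1    # hypothetical bit positions; the text names the bits only
MD5_ERROR = 0x2
MD5_BUSY = 0x4

# Standard MD5 initial state words A, B, C, D.
MD5_INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)


class Md5EngineModel:
    """Toy model of the MD5 registers on the ASIC."""

    def __init__(self, memory: bytearray):
        self.memory = memory             # stands in for dual-port memory 203
        self.state = list(MD5_INIT)      # MD5 state registers
        self.packet_base = 0             # packet base register
        self.packet_length = 0           # packet length register
        self.status = 0                  # MD5 status register

    def write_control(self) -> None:
        """Models the write to the MD5 control register that starts a job."""
        self.status = MD5_BUSY
        end = self.packet_base + self.packet_length
        if self.packet_base < 0 or end > len(self.memory):
            self.status = MD5_DONE | MD5_ERROR
            return
        message = bytes(self.memory[self.packet_base:end])
        digest = hashlib.md5(message).digest()
        # The 16-byte digest is four little-endian 32-bit state words.
        self.state = list(struct.unpack("<4I", digest))
        self.status = MD5_DONE


def cpu_authenticate(engine: Md5EngineModel, base: int, length: int) -> bytes:
    """CPU side: load registers, start the job, poll, read the digest."""
    engine.state = list(MD5_INIT)
    engine.packet_base, engine.packet_length = base, length
    engine.write_control()
    while not engine.status & MD5_DONE:    # models polling the status register
        pass
    if engine.status & MD5_ERROR:
        raise RuntimeError("authentication engine reported an error")
    return struct.pack("<4I", *engine.state)


packet_memory = bytearray(b"hello world" + bytes(5))
md5_engine = Md5EngineModel(packet_memory)
digest = cpu_authenticate(md5_engine, base=0, length=11)
```

Reading the digest back out of the state registers mirrors step 818, where CPU 134 collects the result once the done bit is set.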

In one implementation, parallel processing can be
performed in ASIC 204. For example, the MD5 or SHA1
authentication process can be interleaved with the
encryption/decryption process. When receiving a packet,
ASIC 204 initiates an encryption (DES or Triple-DES)
process on a packet. After a couple of clock cycles, ASIC 204
can start the authentication process (MD5 or SHA1) without
interrupting the encryption process. The two processes
proceed in the same time period and finish at almost the same
time. This can cut the overall processing time in half.
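A back-of-the-envelope model shows where the factor of two comes from. The cycle counts below are invented for illustration, and the function names are hypothetical; the only assumption taken from the text is that authentication starts a few clock cycles after encryption and neither engine slows the other.

```python
def sequential_cycles(encrypt_cycles: int, auth_cycles: int) -> int:
    """Total time when authentication waits for encryption to finish."""
    return encrypt_cycles + auth_cycles


def overlapped_cycles(encrypt_cycles: int, auth_cycles: int,
                      auth_start_delay: int) -> int:
    """Total time when authentication starts shortly after encryption."""
    return max(encrypt_cycles, auth_start_delay + auth_cycles)


# Illustrative numbers only: both engines take 1000 cycles per packet and
# authentication starts 2 cycles after encryption.
serial = sequential_cycles(1000, 1000)
parallel = overlapped_cycles(1000, 1000, auth_start_delay=2)
ratio = parallel / serial
```

With equal per-engine times the overlapped total is 1002 cycles against 2000, a ratio of about 0.5; the saving shrinks as the two engines' processing times diverge.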

More specifically, after a packet is transferred into the
dual-port memory 203, it can be fetched by ASIC 204 using
local bus 202. The encryption/decryption engine 402 can be
invoked, and after several clock cycles, authentication, using
authentication engine 404, can start for the same packet. The
two engines work in an interleaved manner without
sacrificing each engine's performance. In one implementation,
the other possible combinations for parallel processing
include: DES encryption+MD5 authentication, MD5
authentication+DES decryption, Triple DES encryption+
MD5 authentication, MD5 authentication+Triple DES
decryption, DES encryption+SHA1 authentication, SHA1
authentication+DES decryption, Triple DES encryption+
SHA1 authentication and SHA1 authentication+Triple DES
decryption.




Packet flow through each engine can be in blocks or on a
word-by-word basis. In one implementation, the packet data
is grouped in a block and transferred in blocks using the
local bus and memory bus.
The present invention has been described in terms of
specific embodiments, which are illustrative of the invention
and not to be construed as limiting. Other embodiments are
within the scope of the following claims.
What is claimed is: 
1. A gateway for screening packets transferred over a
network, the gateway including a plurality of network
interfaces, each receiving and forwarding messages from a
network through the gateway, a memory for temporarily
storing packets received from a network, and a memory
controller coupled to each of the network interfaces and
configured to coordinate the transfer of received packets to
and from the memory, the gateway including:

a memory bus for transferring the received packets to and
from the memory, the memory bus providing a first
path for retrieving packets from the memory including
a first portion of a rule set, where one or more oft
accessed rule sets are stored;
a firewall engine coupled to the memory bus, the firewall 
engine operable to retrieve packets from the memory 
and screen each packet prior to forwarding a given 
packet through the gateway and out an appropriate 
network interface; 
a local bus coupled between the firewall engine and the
memory providing a second separate non-overlapping
path for retrieving packets to and from the memory;
and

an expandable external rule memory configured to store 
lesser accessed rule sets and coupled to the local bus, 
the external rule memory including a second portion of 
the rule set accessible by the firewall engine using the 
local bus, wherein the firewall engine is operable to
retrieve rules from the second portion of the rule set and 
screen packets in accordance with the retrieved rules. 
2. The gateway of claim 1 wherein the firewall engine is
implemented in a hardware ASIC.
3. The gateway of claim 2 wherein the ASIC includes an
authentication engine operable to authenticate a retrieved
packet contemporaneously with the screening of the
retrieved packet by the firewall engine.

4. The gateway of claim 3 further including a decryption/
encryption engine for decrypting and encrypting retrieved
packets.

5. The gateway of claim 1 wherein the memory is a
dual-port memory configured to support simultaneous read
or write access from each of the memory bus and the local
bus.

6. The gateway of claim 1 further including a direct 
memory access controller configured for controlling 
memory accesses by the firewall engine to the memory when 
using the local bus. 

7. In a gateway for screening packets transferred over a
network, where the gateway includes a plurality of network
interfaces, each receiving and forwarding messages from a
network through the gateway, a memory for temporarily
storing packets received from a network, a memory
controller coupled to each of the network interfaces and configured
to coordinate the transfer of received packets to and from the
memory using a memory bus, and a firewall engine coupled
to the memory bus where the firewall engine is operable to
retrieve packets from the memory and screen each packet
prior to forwarding a given packet through the gateway and
out an appropriate network interface, a rule set for use by the
firewall engine in screening packets comprising:

a first portion of rules stored on an ASIC in an internal rule
memory directly accessible by the firewall engine and 
representing a first portion of a rule space, the rule 
space defining one or more control policies for filtering 
incoming and outgoing packets; and 

an expandable second portion of rules not stored on the 
ASIC, which enlarges the rule memory space providing 
additional rule memory to the gateway by storing the 
second portion of rules in an external memory, the 
expandable second portion of rules coupled by a bus to 
the firewall engine and accessible by the firewall engine 
to screen packets in accordance with the retrieved rules; 

where the first portion of rules includes a pointer to a 
location in the expandable second portion of rules, 
where the pointer is in the form of a rule that includes 
both a pointer code and also an address in the external 
memory designating a next rule to evaluate when 
screening a current packet and where the next rule to 
evaluate is included in the second portion of rules. 

8. In a gateway for screening packets transferred over a 
network, where the gateway includes a plurality of network 
interfaces, each receiving and forwarding messages from a 
network through the gateway, a memory for temporarily 
storing packets received from a network, a memory control- 
ler coupled to each of the network interfaces and configured 
to coordinate the transfer of received packets to and from the 
memory using a memory bus, and a firewall engine coupled 
to the memory bus where the firewall engine is operable to 
retrieve packets from the memory and screen each packet 
prior to forwarding a given packet through the gateway and 
out an appropriate network interface, a rule set for use by the 
firewall engine in screening packets comprising: 

a first portion of rules stored in an internal rule memory 
directly accessible by the firewall engine; and 

an expandable second portion of rules stored in an exter- 
nal memory coupled by a bus to the firewall engine and 
accessible by the firewall engine to screen packets in 
accordance with the retrieved rules, 

wherein the rule set includes a counter rule, the counter 
rule including a matching criteria, a count, a count 
threshold and an action, the count incremented after 
each detected occurrence of a match between a packet 
and the matching criteria associated with the counter 
rule, such that when the count exceeds the count 
threshold the action is invoked. 

9. The rule set of claim 7, wherein the first portion of rules 
includes a pointer to a location in the second portion of rules, 
where the pointer is in the form of a rule that includes both 
a pointer code and also an address in the external memory 
designating a next rule to evaluate when screening a current 
packet and where the next rule to evaluate is included in the 
second portion of rules. 

10. A gateway for screening packets received from a 
network including: 

a plurality of network interfaces each for transmitting and 
receiving packets to and from a network; 

an integrated packet processor including at least two 
processing engines and a direct memory access 
controller, where the at least two processing engines 
include a firewall engine and an authentication engine; 

a dual-port memory for storing packets and a first portion 
of rules, including oft accessed rule sets used by the 
firewall engine for screening the packets; 

an external rule memory including an expandable second
portion of rules where the external rule memory is
configured to store lesser accessed rule sets, where the
first portion of rules includes a pointer to a location in
the expandable second portion of rules, where the
pointer is in the form of a rule that includes both a
pointer code and also an address in the external
memory designating a next rule to evaluate when
screening a current packet and where the next rule to
evaluate is included in the second portion of rules;

a memory bus for coupling the network interfaces, the 
packet processor and the dual-port memory; 

a local bus separately coupling the packet processor, the
dual-port memory and the external memory, the packet
processor invoking the direct memory access controller
to retrieve a packet directly from the dual-port memory
using the local bus;

a memory controller for controlling a transfer of packets
from the network interfaces to the dual-port memory;
and

a processing unit for extracting information from a packet 
and providing the information to the packet processor 
for processing. 

11. The gateway of claim 10 wherein the integrated packet
processor includes a separate encryption/decryption engine
for encrypting and decrypting packets received by the
gateway.

12. A method for screening packets transferred over a
network, comprising:

providing a firewall engine coupled directly to both a
primary memory bus and a local memory bus, where
the primary memory bus and the local memory bus are
separate and non-overlapping; and
retrieving packets from a memory and sending the packets
to the firewall engine using the primary memory bus, if
the primary memory bus is available, otherwise, using
the local memory bus.

13. The gateway of claim 7, wherein: 

the internal rule memory is located on an application
specific integrated circuit.

14. The gateway of claim 13, wherein: 

the external memory is not located on the application 
specific integrated circuit, but is associated with the 
application specific integrated circuit. 
15. The gateway of claim 1, wherein:

the local bus provides a second path for retrieving packets 

from the memory when the memory bus is busy. 
16. A gateway for screening packets transferred over a
network, the gateway including a plurality of network
interfaces, each receiving and forwarding messages from a
network through the gateway, a memory for temporarily
storing packets received from a network, and a memory
controller coupled to each of the network interfaces and
configured to coordinate the transfer of received packets to
and from the memory, the gateway including:

a memory bus for transferring the received packets to and 
from the memory, the memory bus providing a first 
path for retrieving packets from the memory; 
a firewall engine coupled to the memory bus, the firewall 
engine operable to retrieve packets from the memory
and screen each packet prior to forwarding a given 
packet through the gateway and out an appropriate 
network interface; and 
a local bus coupled between the firewall engine and the 
memory providing a separate second path for the
firewall engine to retrieve packets from the memory; 
where the firewall engine accesses the memory using the 
memory bus when the local bus is busy or the local bus 
when the memory bus is busy. 
17. A gateway for screening packets comprising:

a first portion of rules stored on an ASIC in an internal rule
memory directly accessible by the firewall engine and
representing a first portion of the rule space, the rule
space defining one or more control policies for filtering
incoming and outgoing packets; and
an expandable second portion of rules not stored on the
ASIC, which enlarges the rule memory space providing
additional rule memory to the gateway by storing the
second portion of rules in an external memory, the
expandable second portion of rules coupled by a bus to
the firewall engine and accessible by the firewall engine
to screen packets in accordance with the retrieved rules.

* * * * *